Abstract

Background: Streptococcus is an economically important genus, as a number of its species are human and
animal pathogens. The genus has been divided into different groups based on 16S rRNA gene sequence
similarity. The variability observed among the members of these groups is low, and it is difficult to
distinguish them. The present study explores the 16S rRNA gene sequence to develop methods that can be
used for preliminary identification and can supplement the existing methods for identification of
clinically relevant isolates of the genus Streptococcus.

Methods: 16S rRNA gene sequences belonging to isolates of S. dysgalactiae, S. equi, S. pyogenes, S. agalactiae,
S. bovis, S. gallolyticus, S. mutans, S. sobrinus, S. mitis, S. pneumoniae, S. thermophilus and S. anginosus were
analyzed in order to define the genetic variability within each species, to generate a phylogenetic framework,
to identify species-specific signatures and to carry out in-silico restriction enzyme analysis.

Results: The framework-based analysis was used to segregate Streptococcus spp. previously identified only up to
genus level. This segregation was validated using species-specific signatures and in-silico restriction enzyme
analysis. Of 63 uncharacterized Streptococcus sequences, 43 could be identified using this approach.

Conclusions: The markers generated by exploring 16S rRNA gene sequences provide a useful tool that can be
further used for identification of different species of the genus Streptococcus.

Keywords: phylogenetic framework, signature sequences, genetic heterogeneity 



Background 

The genus Streptococcus consists of spherical Gram-positive
bacteria belonging to the class Bacilli and the order
Lactobacillales [1]. The genus is large and comprises numerous
clinically significant species that are responsible for a wide
variety of infections in humans and animals. Streptococci of
different groups are known to cause human diseases, some species
being highly virulent and responsible for major diseases. Species
such as S. pyogenes, S. agalactiae and S. pneumoniae are important
because they cause serious acute infections in humans, but several
other species are also involved in diseases such as infective
endocarditis, abscesses and other pathological conditions [2].
Various species of Streptococcus are known to be associated with
infections of cattle, pigs, horses, sheep, birds, aquatic mammals
and fish [3]. The genus has undergone considerable taxonomic
revision and has been divided into different groups (pyogenic,
anginosus, mitis, mutans, salivarius, bovis) based on 16S rRNA
gene sequence similarity [4].

Since many species belonging to the genus Streptococcus are
associated with various pathological conditions, different
protocols have been used for their identification. Precise
identification of these species nevertheless remains laborious.
Clinical laboratories use serological grouping by Lancefield,
haemolytic reactions and phenotypic tests for identification of
Streptococcus isolates. However, the Lancefield groups are not
species-specific [5,6], and haemolytic activity differs within
species and depends on incubation procedures. Strains within a
given species may differ for a common trait [7,8], and even the
same strain may exhibit biochemical variability [9,10].

Various alternatives have been employed for identification of
Streptococcus isolates. These include DNA hybridization [11-14];
rDNA restriction analysis [15-17]; use of the 16S-23S rRNA
interspacer region [18-21]; D-alanyl-D-alanine ligase gene (ddl)
[22]; autolysin gene (lytA) [23]; dextranase (dex) [24]; heat
shock protein (groESL) [25,26]; RNA subunit of endoribonuclease P
(rnpB) [27]; the elongation factor Tu (tuf) [28]; gyrase A (gyrA)
and topoisomerase subunit C (parC) [29]; sequence analysis of
small subunit rRNA [30]; manganese-dependent superoxide dismutase
gene (sodA) [31,32]; recombination and repair protein (recN) [33];
tDNA PCR fragment length polymorphism [34,35] and use of
multilocus sequence typing loci [36,37].

16S rRNA gene sequencing has proved to be one of the most
powerful tools for the classification of microorganisms [38] and
has been used for identification of clinically relevant microbes
[39,40]. Therefore, molecular tools based on the 16S rRNA gene can
be developed and used for identification. However, correct
identification of bacterial species may not be possible from the
nucleotide sequence of a single gene, and Multilocus Sequence
Analysis (MLSA) of several housekeeping genes has to be performed.
From a practical standpoint, there is still a need for a
simplified approach for preliminary identification of a species,
particularly when the amount of isolated DNA is not sufficient for
MLSA or the DNA does not react with a complete set of typing
primers. The current work considers the possibility of using 16S
rRNA gene sequences for this purpose in practical applications.
Thus the present study aims to explore internal features of the
16S rRNA gene for preliminary identification of a species, as a
supplement to the existing methods for identification. The
methods explored here include construction of a phylogenetic
framework, identification of species-specific signatures and
restriction enzyme analysis.

Methods 

Sequence data 

16S rRNA gene sequences belonging to the genus Streptococcus from
the RDP database http://rdp.cme.msu.edu/ [41] were analysed in
the present study. These included the species represented by a
relatively high number of identified organisms (86 sequences
belonging to isolates of S. dysgalactiae, 61 to S. equi, 61 to
S. pyogenes, 29 to S. agalactiae, 31 to S. bovis-equinus (S. bovis
and S. equinus are considered to be a single species [42]), 76 to
S. gallolyticus, 102 to S. mutans, 23 to S. sobrinus, 28 to
S. mitis, 41 to S. pneumoniae, 73 to S. thermophilus, 32 to
S. anginosus) and 63 sequences of uncharacterized species
identified only up to genus level. The sequences belonging to the
twelve sets of Streptococcus species occurring with higher
frequency were used as the master species set for generating a
phylogenetic framework, species-specific signatures and
restriction enzyme analysis.

Phylogenetic Analyses 

For phylogenetic analyses, the sequences were aligned using the
multiple alignment program CLUSTAL X [43]. Evolutionary distances
between all the sequences were calculated with DNADIST of the
PHYLIP 3.6 package [44]. The program NEIGHBOR was used to
construct a neighbor-joining [45] tree with Jukes and Cantor
correction [46]. Statistical testing of the trees was done using
SEQBOOT by resampling the dataset 1000 times. The trees were
viewed with TreeView Version 1.6.6 [47]. For each of the 12
Streptococcus species data sets, sequences that formed a single
cluster were aligned and a consensus was obtained using the
JALVIEW sequence editor [48]. The sequence closest to the
consensus in each group was chosen as the representative of that
group. Based on this, a reference set of 63 sequences was
selected to define the range of genetic variability present in
each of the Streptococcus species.
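
As an illustration of the Jukes and Cantor correction mentioned above, the following minimal Python sketch computes the corrected distance d = -(3/4) ln(1 - 4p/3) for one pair of aligned sequences, where p is the proportion of differing sites. It is not the DNADIST implementation; the function names and the choice to skip columns containing gaps or ambiguous bases are assumptions made only for this example.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are
    skipped (an assumption of this sketch, not a PHYLIP setting).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: the corrected distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Two made-up aligned fragments differing at 1 of 8 sites.
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(round(p, 3), round(jukes_cantor(p), 4))
```

The corrected distance is always slightly larger than p, because it accounts for multiple substitutions at the same site; a matrix of such pairwise values is the kind of input a neighbor-joining program works from.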

Specific Signatures 

Signatures were identified in each of the species data sets using
the online MEME program [49]. Sequences of the 12 Streptococcus
species data sets were submitted group-wise to MEME Version 4.6.1
http://meme.sdsc.edu/meme4_6_1/cgi-bin/meme.cgi. In order to
obtain the maximum number of motifs, the default setting was
changed from 3 motifs to 20 motifs. The default motif width was
also changed and set to between 25 and 50. Each of the 20
signatures was checked for its frequency of occurrence within a
particular Streptococcus sp. The signatures that did not appear in
other Streptococcus spp. were considered unique. A BLAST search
against the NCBI database http://www.ncbi.nlm.nih.gov/ was carried
out for these signatures to check their uniqueness.
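
The frequency and uniqueness check described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (exact substring matching, sequences held in memory as strings); the function names and the toy sequences are hypothetical and are not taken from the study.

```python
def signature_frequency(signature, sequences):
    """Return (hits, total): how many sequences contain the signature."""
    sig = signature.upper()
    hits = sum(1 for seq in sequences if sig in seq.upper())
    return hits, len(sequences)


def is_unique(signature, target_species, species_sets):
    """True if the signature occurs in the target species set and in
    none of the sequences of any other species set."""
    hits, _ = signature_frequency(signature, species_sets[target_species])
    if hits == 0:
        return False
    for species, sequences in species_sets.items():
        if species == target_species:
            continue
        other_hits, _ = signature_frequency(signature, sequences)
        if other_hits > 0:
            return False
    return True


if __name__ == "__main__":
    # Made-up toy data; real input would be the 16S rRNA gene
    # sequences of each species data set.
    toy_sets = {
        "sp_A": ["TTGACCTGAA", "GGGACCTGTT"],
        "sp_B": ["TTGTCCTGAA"],
    }
    print(signature_frequency("GACCTG", toy_sets["sp_A"]))  # (2, 2)
    print(is_unique("GACCTG", "sp_A", toy_sets))            # True
    print(is_unique("CCTG", "sp_A", toy_sets))              # False
```

Exact matching is the strictest possible criterion; a signature that is 98-100% identical within a species would need an approximate comparison instead.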

Restriction enzyme analysis 

Eleven type II restriction enzymes (Table 1) were considered for
these analyses. Restriction Mapper Version 3
http://restrictionmapper.org/ was used to obtain the restriction
patterns of the 12 Streptococcus species data sets employed for
construction of the phylogenetic framework. These restriction
patterns were analyzed and a consensus pattern was determined for
each species.
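
To make the idea of an in-silico digest concrete, the sketch below scans a linear sequence for the recognition sites of the eleven enzymes of Table 1 and reports fragment lengths. It is a simplified stand-in for a tool such as Restriction Mapper, not a description of how that tool works; the function names and the toy sequence are hypothetical. All eleven recognition sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition site and cut offset (0-based index within the site at
# which the top strand is cleaved) for the eleven enzymes of Table 1.
ENZYMES = {
    "AluI": ("AGCT", 2),
    "BamHI": ("GGATCC", 1),
    "BfaI": ("CTAG", 1),
    "EcoRI": ("GAATTC", 1),
    "HaeIII": ("GGCC", 2),
    "HhaI": ("GCGC", 3),
    "HindIII": ("AAGCTT", 1),
    "MspI": ("CCGG", 1),
    "RsaI": ("GTAC", 2),
    "Sau3AI": ("GATC", 0),
    "SmaI": ("CCCGGG", 3),
}


def cut_positions(seq, enzyme):
    """Sorted cut positions of one enzyme on a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    return positions


def digest(seq, enzyme):
    """Fragment lengths produced by a complete digest."""
    cuts = [c for c in cut_positions(seq, enzyme) if 0 < c < len(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "TTAGCTTTGGCCAAAGCTAA"  # made-up 20-nt sequence
    print(digest(toy, "AluI"))    # [4, 12, 4]
    print(digest(toy, "HaeIII"))  # [10, 10]
    print(digest(toy, "BfaI"))    # [20] (no site, sequence uncut)
```

A sequence with n internal cut sites yields n + 1 fragments, which is why the number of fragments reported for a linear sequence is always one more than the number of sites.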

Cluster analysis for restriction profile 

For cluster analyses MVSP (Multi Variate Statistical 
Package, Kovach Computing Services) version 3.13p was 
used. Dendrograms were constructed using the 

Table 1 Restriction enzymes used in the present study

S.No.   Restriction enzyme   Cut site
1.      AluI                 AG'CT
2.      BamHI                G'GATCC
3.      BfaI                 C'TAG
4.      EcoRI                G'AATTC
5.      HaeIII               GG'CC
6.      HhaI                 GCG'C
7.      HindIII              A'AGCTT
8.      MspI                 C'CGG
9.      RsaI                 GT'AC
10.     Sau3AI               'GATC
11.     SmaI                 CCC'GGG

restriction patterns generated by different restriction 
enzymes for 12 framework species. The dendrograms 
show the utility of these enzymes in distinguishing dif- 
ferent strains. 
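
The source does not state which similarity coefficient or clustering method was selected in MVSP. Purely as an illustration of how restriction patterns can be turned into input for a dendrogram, the sketch below computes pairwise Jaccard distances between sets of fragment lengths; the choice of the Jaccard coefficient, the function names and the toy patterns are all assumptions of this example.

```python
from itertools import combinations


def jaccard_distance(pattern_a, pattern_b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of fragment lengths."""
    a, b = set(pattern_a), set(pattern_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(patterns):
    """Symmetric distance lookup keyed by (name, name) pairs."""
    names = sorted(patterns)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(patterns[x], patterns[y])
        dist[(x, y)] = d
        dist[(y, x)] = d
    return dist


if __name__ == "__main__":
    # Made-up fragment-length patterns for three hypothetical isolates.
    toy_patterns = {
        "isolate_1": [4, 12, 4],
        "isolate_2": [4, 12, 4],
        "isolate_3": [10, 10],
    }
    dist = distance_matrix(toy_patterns)
    print(dist[("isolate_1", "isolate_2")])  # 0.0
    print(dist[("isolate_1", "isolate_3")])  # 1.0
```

Isolates with identical patterns get distance 0 and would join first in any agglomerative clustering built on such a matrix; isolates sharing no fragment get distance 1.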

Results 

In the present study, 16S rRNA gene sequences belonging to 12
different species of the genus Streptococcus were analyzed with
the aim of constructing a phylogenetic framework, identifying
species-specific signatures and carrying out restriction enzyme
analysis.

Phylogenetic framework 

The phylogenetic tree (Additional file 1: Fig. S1) based on 61
sequences of S. pyogenes revealed 8 clusters, and 8 sequences
representing these 8 clusters were chosen. These sequences
represent the genetic heterogeneity present within this species.
Similarly, sequences from the other Streptococcus spp. were
analysed for the genetic heterogeneity present within them
(Additional files 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12: Figs.
S2-S12). Representative sequences were chosen from each species to
capture the range of genetic variability present within that
species (Table 2). In total, 63 such representative sequences were
selected for framework construction (Figure 1). Strains of all the
species were clearly segregated except for S. pneumoniae and
S. mitis, suggesting a high level of similarity between strains of
these two species, which makes their identification difficult
solely on the basis of the 16S rRNA gene.
The framework generated was then used to check whether
uncharacterized Streptococcus spp. could be classified among the
framework species (Figure 2). Out of 63 sequences previously
identified only up to genus level, 43 were found to segregate with
7 Streptococcus framework species, supported by high bootstrap
values. Among these 43 sequences, 3 segregated with S. anginosus,
21 with S. mitis, 6 each with S. pneumoniae and S. gallolyticus,
5 with S. bovis-equinus, and 1 each with S. dysgalactiae and
S. thermophilus (Figure 2). No sequences segregated with
S. mutans, S. pyogenes, S. equi, S. agalactiae or S. sobrinus. The
framework-based segregation was further validated by checking for
the presence of species-specific signatures and by restriction
analysis.

Signature sequences 

Out of the 20 signatures identified for each of the 12
Streptococcus framework species, only 1-5 were found to be unique
(Table 3). The unique signatures were found in: S. mutans: 5;
S. dysgalactiae: 3; S. equi, S. sobrinus and S. thermophilus: 2
each; S. gallolyticus, S. agalactiae, S. pyogenes,
S. bovis-equinus, S. anginosus and S. pneumoniae: 1 each. These
signatures occurred with high frequency within their species.
Moreover, they were highly conserved across a particular species,
showing 98-100% sequence identity, but were fragmented in other
species. No unique signature could be identified for S. mitis.
However, S. mitis can still be distinguished from S. pneumoniae
using the signature unique to S. pneumoniae: in S. mitis this
signature is substituted at specific positions, which
distinguishes the two species (Table 3). The signature found for
S. bovis-equinus was effective in distinguishing it from very
closely related species such as S. lutetiensis and
S. gallolyticus. These signatures were further used to validate
the segregation of the 43 sequences among 7 different framework
species. All 43 sequences were found to contain the signature
specific to the corresponding framework species, thus validating
the affiliation of these sequences to that species.
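
Some signatures in Table 3 are written with the degenerate symbols Pu (purine), Py (pyrimidine) and N (any nucleotide). A minimal sketch of how such a signature could be matched against a sequence is to translate it into a regular expression, as below. The function name and the matching test sequence are hypothetical; the signature string is the S. bovis entry as printed in Table 3.

```python
import re

# Tokens allowed in a signature: the two-letter symbols must be tried
# before the single-letter ones.
_TOKEN = re.compile(r"Pu|Py|[ACGTN]")
_CLASSES = {"Pu": "[AG]", "Py": "[CT]", "N": "[ACGT]"}


def signature_to_regex(signature):
    """Translate a signature written with Pu/Py/N into a compiled regex."""
    tokens = _TOKEN.findall(signature)
    if "".join(tokens) != signature:
        raise ValueError("signature contains an unexpected symbol")
    return re.compile("".join(_CLASSES.get(tok, tok) for tok in tokens))


if __name__ == "__main__":
    bovis = "GCNTTTAACNCATGTTAGPuNGCTTGAAPuGPuAGCAA"
    pattern = signature_to_regex(bovis)
    # Made-up sequence constructed to satisfy the pattern.
    candidate = "GCATTTAACACATGTTAGAAGCTTGAAAGGAGCAA"
    print(pattern.search(candidate) is not None)  # True
```

Because each degenerate symbol becomes a character class, a substitution at a fixed (non-degenerate) position makes the match fail, which is the property used to separate closely related species by signature.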

Restriction enzyme analysis 

In-silico restriction enzyme analysis using eleven type II
enzymes revealed different patterns. Restriction sites for AluI,
BfaI, HaeIII, MspI, RsaI and Sau3AI occurred with a frequency of
3-10, resulting in 4-11 fragments. The sites for the enzymes
EcoRI, SmaI and HhaI were found in the majority of sequences
studied, but these enzymes were less frequent cutters, producing
one, one and two cuts, respectively. They are therefore less
informative for distinguishing species. In spite of their low
cutting frequency, BamHI and HindIII can still be used for
distinguishing different Streptococcus spp. BamHI produced a
single but unique cut in S. thermophilus and can be used to
distinguish S. thermophilus isolates. HindIII produced a single
but unique cut in S. sobrinus and can be used to distinguish
S. sobrinus isolates. As can be seen from the dendrograms
(Additional files 13, 14, 15, 16, 17, 18: Figs. S13-S18),
different restriction enzymes can be used for identifying and
distinguishing different isolates. While AluI was found to
distinguish the majority of Streptococcus

Table 2 Sequences used for generating phylogenetic framework

Species            No. of      No. of       No. of           Accession numbers of representative sequences
                   sequences   clusters     representative
                   used        obtained     sequences
S. dysgalactiae    86          5            5                AB002487, EU660339, AB002511, AB159678, EU075068
S. equi            61          [illegible]  5                EF406002, FM204883, EF406019, AB002516(T), AJ605748(T)
S. pyogenes        61          8            8                FJ798740, FJ662832, FJ662840, AB002521, AE004092, FJ662845,
                                                             CP000003, EU660342
S. agalactiae      29          4            5                AF459432, AB023574, AF015927, AB175037(T), X59032
S. bovis-equinus   31          3            3                AF104109, AJ305257, DQ148956
S. gallolyticus    76          5            5                AF104114, EU163502, EU163482, EU163499, AF459431
S. mutans          102         [illegible]  6                DQ677739, DQ677759, DQ677777, AE014133, AF139600, DQ677743
S. sobrinus        23          4            4                AB294131, DQ677789, DQ677790, DQ677798
S. mitis           28          7            9                AF003929(T), AJ295853, EU200182, AM157440, DQ232533, AM157420,
                                                             AY281076, AB002520, AY281078
S. pneumoniae      41          3            4                AE008386, CP001015, AJ617796, AY525795
S. thermophilus    73          3            4                EF990662, X68418(T), EU419603, FJ749326
S. anginosus       32          5            5                AY986762, AF145239, AF104678(T), AY986764, DQ232517
Total              643         59           63

Type strains are denoted by (T).



framework species, MspI produced the maximum number of cuts and
thus proved informative for such analysis. Closely related
species such as S. dysgalactiae and S. agalactiae can be
distinguished using BfaI, MspI and HaeIII. Similarly,
S. gallolyticus and S. bovis can be distinguished using BfaI and
HaeIII. Therefore a combination of AluI, BfaI, MspI or HaeIII can
be used for distinguishing closely related organisms. The
sequences that segregated with framework species were further
validated using in-silico restriction enzyme analysis. The
identified sequences showed a unique restriction enzyme pattern
close to that of the nearby framework species (Table 4), again
validating the framework-based segregation. Thus, by combining
the information from the framework, signature sequences and
restriction enzyme analysis, it was possible to identify up to
species level 43 sequences (out of 63) that were previously
designated as Streptococcus sp. (Table 4).

Discussion 

Streptococcus is a clinically important genus as a num- 
ber of species belonging to this genus are human and 
animal pathogens. This genus has undergone consider- 
able taxonomic revisions and has been divided into dif- 
ferent groups based on 16S rRNA gene sequence 
similarity. 

The present study aims to explore internal features of the 16S
rRNA gene sequences of different Streptococcus spp. to develop
methods for their identification. A phylogenetic framework was
constructed using different representative sequences, followed by
identification of signature sequences and restriction enzyme
analysis. The framework-based analysis suggests a high level of
genetic heterogeneity within different Streptococcus spp.
Signature sequences specific for each Streptococcus framework
species were identified. These signature motifs would be simple to
use as a supplement to the automated identification process.
Restriction analysis has proved to be an important tool for
identifying newly isolated strains [50-52] and can be exploited
for describing new species [53]. The use of multiple restriction
enzymes is recommended for better resolution. Although it has been
documented that closely related species cannot be distinguished
solely on the basis of the 16S rRNA gene, exploring the internal
features of this gene can be of definite use. Researchers are
therefore now looking at unique features of the 16S rRNA gene that
have not yet been explored [54,55]. As already described, the
genus Streptococcus has been divided into different groups based
on 16S rRNA gene similarity [4]. The framework species used for
these analyses belong to these different groups.

Four framework species, S. pyogenes, S. agalactiae, S. equi and
S. dysgalactiae, belong to the pyogenic group, which is the
largest group of the genus Streptococcus. S. pyogenes,
S. agalactiae, S. equi and S. dysgalactiae can

Figure 1 Phylogenetic tree based on 63 representative 16S rRNA gene sequences from 12 Streptococcus species. The tree was
constructed by the neighbour-joining method with Jukes and Cantor correction. The numbers at the nodes represent bootstrap
values (based on 1000 resamplings). The accession numbers are shown in parentheses.
[Tree image not reproduced. Leaves: the 63 representative sequences listed in Table 2 and E. coli (X80725).]

Figure 2 Phylogenetic tree of 63 framework sequences (bold values) and uncharacterized Streptococcus spp. The tree was
constructed by the neighbour-joining method with Jukes and Cantor correction. The numbers at the nodes represent bootstrap
values (based on 1000 resamplings). The accession numbers are shown in parentheses.

Table 3 Unique signature sequences identified for 12 Streptococcus framework spp.

Streptococcus sp.   Unique signature (nucleotides) (position)                                 Frequency of occurrence
S. dysgalactiae     AATACA(G)TGCAAGTAGAACGCTGAGGACTGGTGCTTGCACCGGTCCAAGGA (52-101)            77/86
                    TGCATCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAA (223-272)              86/86
                    CATTTAAAAGGTGCAATTGCATCACTATGAGATGGACCTGCG (207-246)                      86/86
S. equi             AAGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTAGAGAAGAACAGTGATG (420-469)              61/61
                    AAAGTGCATCATGTGAGGGTAACTAACCAGAAAGGGACGGGTAACTACGT (476-525)              61/61
S. pyogenes         AAACGATAGCTAATACGGCATAAGAGAGACTAACGCATGTTAGTAATTTA (172-222)              56/61
S. agalactiae       GGAGTGGCTTAAGCATTGTACGCTTTGGAAACTGGAGGACTTGAGTGC (613-660)                29/29
S. bovis            GCNTTTAACNCATGTTAGPuNGCTTGAAPuGPuAGCAA (178-212)                          31/31
S. gallolyticus     TCTTGACATCCCGATGCTATTTCTAGAGATAGAAAGTTTCTTCGGAAGAT (991-1040)             76/76
S. mutans           AGTAAAAGGGTATGGCTCAACCATAGTGTGCTCTGGAAACTGTCTGACTT (611-660)              102/102
                    ACCTGGGCTACACACGTGCTACAATGGTGGGTACAACGAGTTGCGAGCCG (1222-1271)            102/102
                    ATGATAATTGATTGAAAGATGCAAGCGCATCACTAGTAGATGGACCTGCG (197-246)              102/102
                    ACTAGTAGATGGACCTGCGTTGTATTAGCTAGTTGGTAAGGTAAGAGCTT (228-277)              102/102
                    AACACACTGTGCTTGCACACGGTG[illegible]GAGTCGCGAAGGGGTGAGT (70-119)           102/102
S. sobrinus         TCACACCAGGAGAGTTTGTAAGACCCAAAGTGGGTGAGGTAACCATTTAT (1417-1466)            23/23
                    AAGTGGAACGCATTGGTAACACCGGACTTGCTCCAGTGTTACTAATGAGT (54-103)               23/23
S. mitis            TPuPyGCATGACPyANPyTNNNTTPuAAAGGTGCANTTGCAPyCACTANNAGATGGA (228-277)       28/28
S. pneumoniae       TGTTGCATGACATTTPuCTTAAAAGGTGCANNTGCATCACTACCAGATGGA (179-228)             41/41
S. thermophilus     ACAATGGTTGGTACAACGAGTTGCGAGTCGGTGACGGCGAGCTAATCTCT (1246-1295)            73/73
                    AAGATGGACCTGCGTTGTATTAGCTAGTAGGTGAGGTAATGGCTCACCTA (234-283)              73/73
S. anginosus        ATTTATTGGGCGTAAAGCGAGCGCAGGGGGTTAGAAAAGTCTGAAGT (581-530)                 32/32

The signatures were found to be 98-100% identical and conserved in a particular species. Py denotes pyrimidine, Pu denotes
purine and N denotes any nucleotide. Overlapping signatures are shown in bold. The numbers in brackets indicate the
corresponding position of the signature in the 16S rRNA gene.



be distinguished on the basis of specific signatures (Table 3) as
well as by using different restriction enzymes (Additional files
13, 14, 15, 16, 17, 18: Figures S13-S18).

Two framework species, S. pneumoniae and S. mitis, belong to the
mitis group. Identification of members of the mitis group,
particularly S. pneumoniae, is problematic. Identification of
S. pneumoniae isolates is usually done using serological [56,57]
and molecular techniques [58-60]. S. pneumoniae isolates can be
identified using the signature sequence given in Table 3; with
this signature we could distinguish S. pneumoniae isolates from
S. mitis. Other members of this group (S. mitis and S. oralis)
that are almost indistinguishable on the basis of the complete 16S
rRNA gene sequence can be differentiated using different
restriction enzymes. S. mitis can be distinguished from the other
two species of this group by using the enzyme Sau3AI.
S. pneumoniae, S. mitis and S. oralis can be distinguished from
each other by exploiting AluI and MspI (data not shown for
S. oralis).

Framework species S. anginosus belongs to the anginosus group.
This group consists of only 3 species (S. anginosus,
S. intermedius and S. constellatus). Members of the anginosus
group are also difficult to identify and distinguish. An
identification scheme for differentiation of these 3 species was
proposed by Whiley et al. [61] and Whiley and Beighton [62].
Commercial identification systems [63,64] and molecular methods
have been used for identifying and distinguishing these three
species [65,14]. S. anginosus can be identified using the
signature sequence (Table 3) and restriction enzymes. The
restriction enzymes AluI, BfaI, RsaI and HaeIII can be used for
distinguishing members of the anginosus group efficiently (data
not shown for S. intermedius and S. constellatus).

Framework species S. thermophilus belongs to the salivarius
group, which is closely related to the bovis group [66] and
consists of only 3 species (S. salivarius, S. vestibularis,
S. thermophilus). S. thermophilus can be identified by using its
unique signature sequences (Table 3).

Two framework species, S. bovis and S. gallolyticus, belong to the
bovis group. These two members of the bovis group are difficult to
identify. Isolates of the two species can be distinguished using
the signatures specific for S. gallolyticus and S. bovis.
Moreover, the restriction enzymes AluI, BfaI and HaeIII can be
instrumental in distinguishing them. The signature found for
S. bovis was also efficient in distinguishing S. bovis from the
closely related species S. lutetiensis. These two species are
difficult to distinguish solely on the basis of the 16S rRNA gene.

Framework species S. mutans and S. sobrinus belong to the mutans
group. These two species are also difficult to

Table 4 Streptococcus spp. identified up to species level

S. No.  Accession no.                              Close framework    Presence of         Unique RE pattern
                                                   species            species-specific
                                                                      signatures
1       EF015474                                   S. dysgalactiae    Y                   S. dysgalactiae (MspI)
2       AF221604, DQ471794, AF298198, AB295598,    S. gallolyticus    Y                   S. gallolyticus (BfaI)
        AF084834, AF298197
3       AF349932                                   S. thermophilus    Y                   S. thermophilus (AluI, HaeIII, Sau3AI, BamHI)
4       X78826, X78825, AY049738                   S. anginosus       Y                   S. anginosus (AluI, BfaI, HaeIII)
5       AF316591, AF316596, AF316593, AF316594,    S. pneumoniae      Y                   S. pneumoniae (AluI, MspI)
        AF316595, EF151147
6       AF316592, EF151145, AF385523, FJ405281,    S. mitis           Y                   S. mitis (AluI, MspI, Sau3AI)
        AY880050, AY880051, AF385526, AB038371,
        EF151150, AY005041, AF479579, AY005040,
        AY134908, AY741062, AY741061, AJ871182,
        AF385525, AB174792, AB174791, AY005047,
        AF543289
7       AJ937757, GQ139522, AF298199, FJ611789,    S. bovis-equinus   Y                   S. bovis-equinus (BfaI)
        FJ611790



distinguish. Beighton et al. (1991) [8] provided a scheme for
identification of S. mutans and S. sobrinus strains. In the
present investigation these two species can be distinguished using
species-specific signatures and the restriction enzymes AluI,
BfaI, HaeIII, MspI, RsaI and Sau3AI.



Conclusions 

The species that are difficult to distinguish solely on the basis
of 16S rRNA gene sequence similarity can be identified using
internal features of the 16S rRNA gene, the signatures. These
signatures can be exploited for quick identification. The aim of
phylogenetic framework construction is to define the range of
genetic variability within each species and then to exploit this
variability for identification of different isolates. Similarly,
the use of restriction enzymes helps in generating markers that
can distinguish closely related species. The present study reveals
that the framework, specific signatures in the 16S rRNA gene and
the patterns generated by different restriction enzymes can be
exploited for identification of isolates belonging to the genus
Streptococcus. The markers generated in the present study are
based on the 16S rRNA gene sequence, which is conserved and is
neither subject to changes due to culture conditions nor exhibits
biochemical variability. Thus the scheme proposed can be applied
to any isolate. The approach is a cost-effective and rapid way of
identifying various isolates and can thus be used to differentiate
isolates that are difficult to distinguish because of very close
traits and biochemical features. Additionally, the approach is
simple for preliminary identification of a species and can
supplement existing automated identification processes. However,
it should be kept in mind that this is a simplified procedure, and
it is important to know the limitations of such a simplified
approach.

Additional material 



The accession numbers are shown in parenthesis. Bold sequences 
indicate those which are used for final framework construction. 

Additional file 7: Phylogenetic tree based on 102 16S rRNA gene
sequences of Streptococcus mutans. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 8: Phylogenetic tree based on 23 16S rRNA gene
sequences of Streptococcus sobrinus. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 9: Phylogenetic tree based on 28 16S rRNA gene
sequences of Streptococcus mitis. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 10: Phylogenetic tree based on 41 16S rRNA gene
sequences of Streptococcus pneumoniae. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 11: Phylogenetic tree based on 32 16S rRNA gene
sequences of Streptococcus anginosus. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 12: Phylogenetic tree based on 73 16S rRNA gene
sequences of Streptococcus thermophilus. The tree was constructed by the
neighbour-joining method with Jukes and Cantor correction. The
numbers at the nodes represent bootstrap values (based on 1000 resamplings).
The accession numbers are shown in parentheses. Bold sequences
indicate those which are used for final framework construction.

Additional file 13: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with AluI.

Additional file 14: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with BfaI.

Additional file 15: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with HaeIII.

Additional file 16: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with MspI.

Additional file 17: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with RsaI.

Additional file 18: Dendrogram based on restriction digestion of 12
Streptococcus framework spp. with Sau3AI.